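Over the years, datasets have been developed for various object detection tasks. Object detection in the maritime domain is essential for the safety and navigation of ships. However, there is still a lack of publicly available large-scale datasets in the maritime domain. To overcome this challenge, we present KOLOMVERSE, an open large-scale image dataset for object detection in the maritime domain by KRISO (Korea Research Institute of Ships and Ocean Engineering). We collected 5,845 hours of video data captured from 21 territorial waters of South Korea. Through an elaborate data quality assessment process, we gathered around 2,151,470 4K-resolution images from the video data. The dataset covers a variety of environments: weather, time, illumination, occlusion, viewpoint, background, wind speed, and visibility. KOLOMVERSE consists of five classes (ship, buoy, fishnet buoy, lighthouse, and wind farm) for maritime object detection. The images are $3840 \times 2160$ pixels and, to our knowledge, this is by far the largest publicly available dataset for object detection in the maritime domain. We performed object detection experiments and evaluated our dataset on several pre-trained state-of-the-art architectures to show its effectiveness and usefulness. The dataset is available at: \url{https://github.com/maritimedataset/kolomverse}.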
translated by Google Translate
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling, based either on multiple identical models (61%) or on heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
Transformer-based models have gained great popularity and demonstrated promising results in long-term time-series forecasting in recent years. In addition to learning attention in the time domain, recent works also explore learning attention in frequency domains (e.g., Fourier domain, wavelet domain), given that seasonal patterns can be better captured in these domains. In this work, we seek to understand the relationships between attention models in different time and frequency domains. Theoretically, we show that attention models in different domains are equivalent under linear conditions (i.e., a linear kernel applied to the attention scores). Empirically, we analyze how attention models of different domains behave differently through various synthetic experiments with seasonality, trend, and noise, with emphasis on the role of the softmax operation therein. Both the theoretical and the empirical analyses motivate us to propose a new method, TDformer (Trend Decomposition Transformer), which first applies seasonal-trend decomposition and then additively combines an MLP that predicts the trend component with Fourier attention that predicts the seasonal component to obtain the final prediction. Extensive experiments on benchmark time-series forecasting datasets demonstrate that TDformer achieves state-of-the-art performance against existing attention-based models.
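The linear-case equivalence claimed above can be checked numerically on a toy example. The following is a minimal sketch, not the authors' code: it assumes attention without softmax, written as (Q K^H) V, and a unitary discrete Fourier transform along the time axis. The helper names (`dft`, `linear_attention`) and the toy sizes are hypothetical.

```python
import cmath
import random


def dft(X, inverse=False):
    """Unitary DFT (or its inverse) along the time axis of an L x d matrix."""
    L, d = len(X), len(X[0])
    sign = 1 if inverse else -1
    return [
        [
            sum(X[n][j] * cmath.exp(sign * 2j * cmath.pi * k * n / L)
                for n in range(L)) / L ** 0.5
            for j in range(d)
        ]
        for k in range(L)
    ]


def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def conj_transpose(A):
    """Conjugate transpose; for real inputs this is the ordinary transpose."""
    return [[x.conjugate() for x in col] for col in zip(*A)]


def linear_attention(Q, K, V):
    """Attention with a linear kernel (no softmax): (Q K^H) V."""
    return matmul(matmul(Q, conj_transpose(K)), V)


random.seed(0)
seq_len, dim = 8, 3  # toy sizes, chosen only for illustration


def random_matrix():
    return [[random.gauss(0, 1) for _ in range(dim)] for _ in range(seq_len)]


Q, K, V = random_matrix(), random_matrix(), random_matrix()

# Attention computed directly in the time domain.
time_out = linear_attention(Q, K, V)

# Attention computed on Fourier coefficients, then mapped back to the time domain.
freq_out = dft(linear_attention(dft(Q), dft(K), dft(V)), inverse=True)

max_diff = max(
    abs(a - b) for row_a, row_b in zip(time_out, freq_out) for a, b in zip(row_a, row_b)
)
print("max |time - frequency| =", max_diff)
```

The two outputs agree up to floating-point error because the unitary transform cancels inside the product. Applying a softmax to the scores before multiplying by V breaks this identity in general, which is consistent with the abstract's emphasis on the role of softmax.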
Image super-resolution is a common task on mobile and IoT devices, where one often needs to upscale and enhance low-resolution images and video frames. While numerous solutions have been proposed for this problem in the past, they are usually not compatible with low-power mobile NPUs, which have many computational and memory constraints. In this Mobile AI challenge, we address this problem and ask the participants to design an efficient quantized image super-resolution solution that can demonstrate real-time performance on mobile NPUs. The participants were provided with the DIV2K dataset and trained INT8 models to perform high-quality 3X image upscaling. The runtime of all models was evaluated on the Synaptics VS680 Smart Home board with a dedicated edge NPU capable of accelerating quantized neural networks. All proposed solutions are fully compatible with the above NPU, demonstrating up to 60 FPS when reconstructing Full HD resolution images. A detailed description of all models developed in the challenge is provided in this paper.
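For readers unfamiliar with INT8 models, the sketch below illustrates generic affine (scale and zero-point) quantization of a list of floating-point values to signed 8-bit integers. It is an assumption-laden illustration of the general idea only: the abstract does not describe the quantization scheme used by the participants or by the NPU toolchain, and the function names and sample values are hypothetical.

```python
def quantize_int8(values):
    """Affine quantization of floats to signed 8-bit integers (generic sketch)."""
    qmin, qmax = -128, 127
    # Include 0.0 in the range so that zero is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # fall back to 1.0 for an all-zero input
    zero_point = max(qmin, min(qmax, round(qmin - lo / scale)))
    q = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize_int8(q, scale, zero_point):
    """Map the integers back to approximate floating-point values."""
    return [(x - zero_point) * scale for x in q]


weights = [-1.0, -0.25, 0.0, 0.5, 1.0]  # hypothetical values, not from the paper
q, scale, zero_point = quantize_int8(weights)
restored = dequantize_int8(q, scale, zero_point)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, zero_point, max_err)
```

The round-trip error stays within one quantization step (`scale`), which is the precision a model gives up in exchange for integer arithmetic on the accelerator.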
Accurately extracting driving events is key to maximizing computational efficiency and anomaly detection performance in the tire friction noise-based anomaly detection task. This study proposes a concise and highly useful method for improving the precision of event extraction, which is hindered by extra noise such as wind noise that is difficult to characterize clearly due to its randomness. The core of the proposed method is to identify the road friction sound corresponding to the frequency of interest and to remove components with the opposite characteristics using several frequency filters. Our method maximizes the precision of driving event extraction while improving anomaly detection performance by an average of 8.506%. Therefore, we conclude that our method is a practical solution suitable for road surface anomaly detection in outdoor edge computing environments.
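In contrast to conventional closed-set recognition, open-set recognition (OSR) assumes the presence of an unknown class that the model does not see during training. One predominant approach in OSR is metric learning, where a model is trained to separate the inter-class representations of known-class data. Numerous works in OSR have reported that, even though the models are trained only with known-class data, they become aware of the unknown and learn to separate the unknown-class representations from the known-class representations. This paper analyzes this emergent phenomenon by observing the Jacobian norm of the representation. We show theoretically that minimizing the intra-class distances within the known set reduces the Jacobian norm of the known-class representations, while maximizing the inter-class distances within the known set increases the Jacobian norm of the unknown class. Closed-set metric learning thus separates the unknown from the known by forcing their Jacobian norm values to differ. We empirically validate our theoretical framework with ample evidence using standard OSR datasets. Moreover, under our theoretical framework, we explain how standard deep learning techniques can be helpful for OSR and use the framework as a guiding principle to develop an effective OSR model.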
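To make the quantity under discussion concrete, the sketch below estimates the Frobenius norm of the Jacobian of a representation function with central finite differences. This is only an illustration under stated assumptions: the abstract does not say which norm or estimator the authors use, and the function `jacobian_frobenius_norm` and the toy linear map are hypothetical stand-ins for a trained encoder.

```python
import math


def jacobian_frobenius_norm(f, x, eps=1e-6):
    """Estimate ||df/dx||_F at the point x using central finite differences."""
    total = 0.0
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += eps
        x_minus[i] -= eps
        # Column i of the Jacobian: derivative of every output w.r.t. input i.
        for a, b in zip(f(x_plus), f(x_minus)):
            total += ((a - b) / (2 * eps)) ** 2
    return math.sqrt(total)


# Toy "encoder": a fixed linear map, whose Jacobian is the matrix A itself.
A = [[1.0, 2.0], [3.0, 4.0]]


def linear_encoder(x):
    return [sum(a * v for a, v in zip(row, x)) for row in A]


norm = jacobian_frobenius_norm(linear_encoder, [0.3, -0.7])
print(norm, math.sqrt(30.0))  # for a linear map the norm equals ||A||_F
```

For a nonlinear encoder the value depends on the input, which is what allows known-class and unknown-class inputs to be compared by their Jacobian norms.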
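This paper presents a decentralized multi-agent trajectory planning (MATP) algorithm that is guaranteed to generate safe, deadlock-free trajectories in obstacle-rich environments under a limited communication range. The proposed algorithm uses a grid-based multi-agent path planning (MAPP) algorithm for deadlock resolution, and we introduce a subgoal optimization method that makes each agent converge, without deadlock, to the waypoint generated by MAPP. In addition, the proposed algorithm ensures the feasibility of the optimization problem and collision avoidance by adopting a linear safe corridor (LSC). We verify that the proposed algorithm does not cause deadlock in either random forests or dense mazes, regardless of the communication range, and that it outperforms our previous work in flight time and distance. We validate the proposed algorithm through a hardware demonstration with ten quadrotors.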
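IceCube, a cubic-kilometer array of optical sensors built to detect atmospheric and astrophysical neutrinos between 1 GeV and 1 PeV, is deployed 1.45 km to 2.45 km below the surface of the ice sheet at the South Pole. The classification and reconstruction of events from the in-ice detectors play a central role in the analysis of IceCube data. Reconstructing and classifying events is a challenge because of the detector geometry, the inhomogeneous scattering and absorption of light in the ice, and, below 100 GeV, the relatively low number of signal photons produced per event. To address this challenge, IceCube events can be represented as point-cloud graphs, with a graph neural network (GNN) used as the classification and reconstruction method. The GNN is capable of distinguishing neutrino events from cosmic-ray backgrounds, classifying different neutrino event types, and reconstructing the deposited energy, direction, and interaction vertex. Based on simulation, we provide a comparison in the 1-100 GeV energy range with the current state-of-the-art maximum likelihood techniques used in current IceCube analyses, including the effects of known systematic uncertainties. For neutrino event classification, the GNN increases the signal efficiency by 18% at a fixed false positive rate (FPR) compared to current IceCube methods. Alternatively, the GNN reduces the FPR by more than a factor of 8 (to below half a percent) at a fixed signal efficiency. For the reconstruction of energy, direction, and interaction vertex, the resolution improves by an average of 13%-20% compared to current maximum likelihood techniques. When running on a GPU, the GNN is capable of processing IceCube events at nearly double the median IceCube trigger rate of 2.7 kHz, which opens the possibility of using low-energy neutrinos in online searches for transient events.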
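One common way to turn a point cloud into a graph is to connect each point to its k nearest neighbours. The sketch below shows that construction on a handful of made-up sensor hit positions. It is an assumption for illustration: the abstract does not state how edges are built, and the function `knn_edges`, the value of k, and the coordinates are all hypothetical.

```python
import math


def knn_edges(points, k):
    """Return directed edges (i, j) from each point i to its k nearest neighbours j."""
    edges = []
    for i, p in enumerate(points):
        # Sort all other points by Euclidean distance to point i.
        by_distance = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        edges.extend((i, j) for _, j in by_distance[:k])
    return edges


# Hypothetical hit positions (x, y, z), not real detector coordinates.
hits = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (10.0, 10.0, 10.0)]
edges = knn_edges(hits, k=2)
print(edges)
```

In a real pipeline each node would also carry features such as hit time and charge, and the resulting edge list would be passed to a GNN library; neither of those steps is shown here.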
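Feature similarity matching, which transfers information from the reference frame to the query frame, is a key component in semi-supervised video object segmentation. If surjective matching is adopted, background distractors can easily occur and degrade performance. Bijective matching mechanisms try to prevent this by restricting the amount of information transferred to the query frame, but they have two limitations: 1) surjective matching cannot be fully exploited because it is converted to bijective matching at test time; and 2) searching for the optimal hyperparameters requires manual tuning at test time. To overcome these limitations while ensuring reliable information transfer, we introduce an equalized matching mechanism. To prevent the reference frame information from being overly referenced, the potential contribution to the query frame is equalized by simply applying a softmax operation along the query dimension. On public benchmark datasets, our proposed approach achieves performance comparable to state-of-the-art methods.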
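The equalization step can be illustrated on a tiny similarity matrix. In the sketch below, `similarity[r][q]` is a made-up score between reference pixel `r` and query pixel `q`, and the softmax is taken over the query index so that every reference pixel distributes a total weight of 1 across the query frame. The matrix values, the function names, and the final "best reference per query pixel" readout are assumptions for illustration, not details taken from the paper.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of scores."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def equalize(similarity):
    """Apply softmax along the query dimension: each reference row sums to 1."""
    return [softmax(row) for row in similarity]


def best_reference(scores, query_index):
    """Index of the reference pixel with the highest score for one query pixel."""
    return max(range(len(scores)), key=lambda r: scores[r][query_index])


# Reference pixel 0 acts like a distractor: it matches every query pixel strongly.
# Reference pixel 1 matches only query pixel 0.
similarity = [
    [5.0, 5.0, 5.0],
    [4.0, 0.0, 0.0],
]
equalized = equalize(similarity)

best_raw = best_reference(similarity, 0)
best_eq = best_reference(equalized, 0)
print(best_raw, best_eq)
```

With the raw scores, the distractor wins for query pixel 0. After equalization its weight is spread evenly over all query pixels, so the more selective reference pixel wins instead.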
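This paper presents the persistent Weisfeiler-Lehman random walk scheme (abbreviated as PWLR) for graph representations, a novel mathematical framework that produces a collection of explainable low-dimensional representations of graphs with discrete and continuous node features. The proposed scheme effectively incorporates the normalized Weisfeiler-Lehman procedure, random walks on graphs, and persistent homology. We thereby integrate three distinct properties of graphs, namely local topological features, node degrees, and global topological invariants, while preserving stability under graph perturbations. This generalizes many variants of the Weisfeiler-Lehman procedure, which are primarily used to embed graphs with discrete node labels. Empirical results suggest that these representations can be used efficiently to produce results comparable to state-of-the-art techniques in classifying graphs with discrete node labels, and enhanced performance in classifying graphs with continuous node features.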
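As background for the procedure that PWLR generalizes, the sketch below implements the classic Weisfeiler-Lehman colour refinement for graphs with discrete node labels. It is the standard textbook version, not the normalized variant or the persistent-homology pipeline proposed in the paper, and the function name and example graph are hypothetical.

```python
def wl_refine(adjacency, labels, iterations):
    """Classic 1-dimensional Weisfeiler-Lehman relabelling.

    adjacency: dict mapping each node to a list of its neighbours.
    labels: dict mapping each node to an initial integer label.
    """
    labels = dict(labels)
    for _ in range(iterations):
        # A node's signature is its own label plus the sorted labels of its neighbours.
        signatures = {
            v: (labels[v], tuple(sorted(labels[u] for u in adjacency[v])))
            for v in adjacency
        }
        # Compress each distinct signature into a new small integer label.
        palette = {sig: i for i, sig in enumerate(sorted(set(signatures.values())))}
        labels = {v: palette[signatures[v]] for v in adjacency}
    return labels


# Path graph 0 - 1 - 2 with identical initial labels.
path = {0: [1], 1: [0, 2], 2: [1]}
refined = wl_refine(path, {0: 0, 1: 0, 2: 0}, iterations=1)
print(refined)
```

After one round the two endpoints share a label while the middle node receives a different one, reflecting their different neighbourhoods. Continuous node features do not fit this hashing step directly, which is the gap the abstract says PWLR addresses.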